AITopics | kernel density estimate

Collaborating Authors

kernel density estimate

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Appendix 602 A Design of LSH Functions: 603 In practice, we could realize d (x, y)

Neural Information Processing SystemsFeb-9-2026, 20:37:58 GMT

As shown in Definition A.1, the SRP hash is an LSH function built upon random Gaussian projections, The SRP hash is usually designed for dense vectors. Following Definition A.2, MinHash serves as a powerful tool for Jaccard similarity estimation [ In this paper, we take a kernel view of the collision probability of LSH (see Definition 3.2). In this section, we introduce the client selection algorithm with our one-pass sketch. To formally prove the Theorems in the paper. Next, we introduce the formal statements in the paper as below.

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining (0.56)
Information Technology > Artificial Intelligence > Machine Learning (0.35)

Add feedback

A General Details about the Architecture A.1 Hyperparameters and How to Set Them

Neural Information Processing SystemsAug-14-2025, 16:52:11 GMT

In python, these are like decorators.

dataset, hyperparameter, neural interpreter, (12 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Consistent and Differentiable L p Canonical Calibration Error Estimator

Neural Information Processing SystemsAug-14-2025, 05:03:40 GMT

Calibrated probabilistic classifiers are models whose predicted probabilities can directly be interpreted as uncertainty estimates.

calibration, calibration error, estimator, (16 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)

Add feedback

Generating Synthetic Data with Formal Privacy Guarantees: State of the Art and the Road Ahead

Schlegel, Viktor, Bharath, Anil A, Zhao, Zilong, Yee, Kevin

arXiv.org Artificial IntelligenceMar-26-2025

Privacy-preserving synthetic data offers a promising solution to harness segregated data in high-stakes domains where information is compartmentalized for regulatory, privacy, or institutional reasons. This survey provides a comprehensive framework for understanding the landscape of privacy-preserving synthetic data, presenting the theoretical foundations of generative models and differential privacy followed by a review of state-of-the-art methods across tabular data, images, and text. Our synthesis of evaluation approaches highlights the fundamental trade-off between utility for down-stream tasks and privacy guarantees, while identifying critical research gaps: the lack of realistic benchmarks representing specialized domains and insufficient empirical evaluations required to contextualise formal guarantees. Through empirical analysis of four leading methods on five real-world datasets from specialized domains, we demonstrate significant performance degradation under realistic privacy constraints ($\epsilon \leq 4$), revealing a substantial gap between results reported on general domain benchmarks and performance on domain-specific data. %Our findings highlight key challenges including unaccounted privacy leakage, insufficient empirical verification of formal guarantees, and a critical deficit of realistic benchmarks. These challenges underscore the need for robust evaluation frameworks, standardized benchmarks for specialized domains, and improved techniques to address the unique requirements of privacy-sensitive fields such that this technology can deliver on its considerable potential.

knowledge management, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2503.20846

Country:

Asia > Singapore (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(5 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.86)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
(4 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Knowledge Management (1.00)
Information Technology > Information Management (1.00)
(9 more...)

Add feedback

Spectral Toolkit of Algorithms for Graphs: Technical Report (2)

Macgregor, Peter, Sun, He

arXiv.org Artificial IntelligenceJun-6-2024

Spectral Toolkit of Algorithms for Graphs (STAG) is an open-source C++ and Python library providing several methods for working with graphs and performing graph-based data analysis. In this technical report, we provide an update on the development of the STAG library. The report serves as a user's guide for the newly implemented algorithms, and gives implementation details and engineering choices made in the development of the library. The report is structured as follows: Section 2 describes the locality sensitive hashing, and the main components used in its construction. Section 3 describes the kernel density estimation, and the state-of-the-art algorithm for the kernel density estimation.

algorithm, kernel density, stag, (13 more...)

arXiv.org Artificial Intelligence

2407.07096

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Dimensionality Reduction and Dynamical Mode Recognition of Circular Arrays of Flame Oscillators Using Deep Neural Network

Xu, Weiming, Yang, Tao, Zhang, Peng

arXiv.org Artificial IntelligenceDec-13-2023

Oscillatory combustion in aero engines and modern gas turbines often has significant adverse effects on their operation, and accurately recognizing various oscillation modes is the prerequisite for understanding and controlling combustion instability. However, the high-dimensional spatial-temporal data of a complex combustion system typically poses considerable challenges to the dynamical mode recognition. Based on a two-layer bidirectional long short-term memory variational autoencoder (Bi-LSTM-VAE) dimensionality reduction model and a two-dimensional Wasserstein distance-based classifier (WDC), this study proposes a promising method (Bi-LSTM-VAE-WDC) for recognizing dynamical modes in oscillatory combustion systems. Specifically, the Bi-LSTM-VAE dimension reduction model was introduced to reduce the high-dimensional spatial-temporal data of the combustion system to a low-dimensional phase space; Gaussian kernel density estimates (GKDE) were computed based on the distribution of phase points in a grid; two-dimensional WD values were calculated from the GKDE maps to recognize the oscillation modes. The time-series data used in this study were obtained from numerical simulations of circular arrays of laminar flame oscillators. The results show that the novel Bi-LSTM-VAE method can produce a non-overlapping distribution of phase points, indicating an effective unsupervised mode recognition and classification. Furthermore, the present method exhibits a more prominent performance than VAE and PCA (principal component analysis) for distinguishing dynamical modes in complex flame systems, implying its potential in studying turbulent combustion.

artificial intelligence, diffusion flame, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2312.02462

Country: Asia > China > Hong Kong (0.16)

Genre:

Research Report > Promising Solution (0.88)
Research Report > New Finding (0.87)

Industry:

Energy > Oil & Gas (0.48)
Health & Medicine (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Generalized Oversampling for Learning from Imbalanced datasets and Associated Theory

Stocksieker, Samuel, Pommeret, Denys, Charpentier, Arthur

arXiv.org Artificial IntelligenceAug-5-2023

In supervised learning, it is quite frequent to be confronted with real imbalanced datasets. This situation leads to a learning difficulty for standard algorithms. Research and solutions in imbalanced learning have mainly focused on classification tasks. Despite its importance, very few solutions exist for imbalanced regression. In this paper, we propose a data augmentation procedure, the GOLIATH algorithm, based on kernel density estimates which can be used in classification and regression. This general approach encompasses two large families of synthetic oversampling: those based on perturbations, such as Gaussian Noise, and those based on interpolations, such as SMOTE. It also provides an explicit form of these machine learning algorithms and an expression of their conditional densities, in particular for SMOTE. New synthetic data generators are deduced. We apply GOLIATH in imbalanced regression combining such generator procedures with a wild-bootstrap resampling technique for the target values. We evaluate the performance of the GOLIATH algorithm in imbalanced regression situations. We empirically evaluate and compare our approach and demonstrate significant improvement over existing state-of-the-art techniques.

artificial intelligence, dataset, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2308.02966

Country:

Oceania > Australia > Western Australia (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Finite-Sample Symmetric Mean Estimation with Fisher Information Rate

Gupta, Shivam, Lee, Jasper C. H., Price, Eric

arXiv.org Artificial IntelligenceJun-28-2023

The mean of an unknown variance-$\sigma^2$ distribution $f$ can be estimated from $n$ samples with variance $\frac{\sigma^2}{n}$ and nearly corresponding subgaussian rate. When $f$ is known up to translation, this can be improved asymptotically to $\frac{1}{n\mathcal I}$, where $\mathcal I$ is the Fisher information of the distribution. Such an improvement is not possible for general unknown $f$, but [Stone, 1975] showed that this asymptotic convergence $\textit{is}$ possible if $f$ is $\textit{symmetric}$ about its mean. Stone's bound is asymptotic, however: the $n$ required for convergence depends in an unspecified way on the distribution $f$ and failure probability $\delta$. In this paper we give finite-sample guarantees for symmetric mean estimation in terms of Fisher information. For every $f, n, \delta$ with $n > \log \frac{1}{\delta}$, we get convergence close to a subgaussian with variance $\frac{1}{n \mathcal I_r}$, where $\mathcal I_r$ is the $r$-$\textit{smoothed}$ Fisher information with smoothing radius $r$ that decays polynomially in $n$. Such a bound essentially matches the finite-sample guarantees in the known-$f$ setting.

artificial intelligence, log 1, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2306.16573

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Dimensionality Reduction for General KDE Mode Finding

Luo, Xinyu, Musco, Christopher, Widdershoven, Cas

arXiv.org Artificial IntelligenceJun-1-2023

Finding the mode of a high dimensional probability distribution $D$ is a fundamental algorithmic problem in statistics and data analysis. There has been particular interest in efficient methods for solving the problem when $D$ is represented as a mixture model or kernel density estimate, although few algorithmic results with worst-case approximation and runtime guarantees are known. In this work, we significantly generalize a result of (LeeLiMusco:2021) on mode approximation for Gaussian mixture models. We develop randomized dimensionality reduction methods for mixtures involving a broader class of kernels, including the popular logistic, sigmoid, and generalized Gaussian kernels. As in Lee et al.'s work, our dimensionality reduction results yield quasi-polynomial algorithms for mode finding with multiplicative accuracy $(1-\epsilon)$ for any $\epsilon > 0$. Moreover, when combined with gradient descent, they yield efficient practical heuristics for the problem. In addition to our positive results, we prove a hardness result for box kernels, showing that there is no polynomial time algorithm for finding the mode of a kernel density estimate, unless $\mathit{P} = \mathit{NP}$. Obtaining similar hardness results for kernels used in practice (like Gaussian or logistic kernels) is an interesting future direction.

artificial intelligence, kernel, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2305.18755

Country:

North America > United States > New York (0.04)
North America > United States > Indiana (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.83)

Add feedback

Understanding Mean Shift Clustering(Artficial Intelligence)

#artificialintelligenceJul-10-2022, 16:15:33 GMT

Abstract: In this study, a novel method for the construction of a driving cycle based on Mean Shift clustering is proposed to solve the problems existing in the traditional micro-trips method. Firstly, 1701 kinematic segments are obtained by processing and dividing the driving data in real road conditions. Secondly, 12 kinematic parameters are calculated for each segment, and the dimensionality of parameters is reduced through principal component analysis (PCA). Three principal components are chosen to classify all cycles into three types by the Mean Shift algorithm. Finally, according to the principle of minimum deviation, representative micro-trips are selected from each type of cycle to complete the construction of the final driving cycle.

algorithm, artficial intelligence, construction method, (9 more...)

#artificialintelligence

Genre: Research Report > New Finding (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.39)

Add feedback